
Case Study: Image-to-Image Translation with pix2pix

Load the dataset

To demonstrate image-to-image translation, I will use the Cityscapes dataset sourced from the UC Berkeley pix2pix repository. Each sample pairs a real photograph of an urban street scene with its semantically segmented version. The objective of the GAN is to translate a semantic segmentation map into a realistic urban street image.

Each original image is 256 by 512 pixels and contains two 256 by 256 images side by side.
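Since the code cells are omitted here, a minimal NumPy sketch of splitting one paired image into its two halves (which half holds the photo and which the label map is an assumption about how the pairs were exported, so swap the names if needed):

```python
import numpy as np

def split_pair(combined):
    """Split a 256x512 paired image into its two 256x256 halves.

    Assumption: the real photo is on the left and the semantic
    segmentation map on the right of the combined image.
    """
    half = combined.shape[1] // 2
    real = combined[:, :half]    # left 256x256 half
    label = combined[:, half:]   # right 256x256 half
    return real, label

pair = np.zeros((256, 512, 3), dtype=np.uint8)
real, label = split_pair(pair)
print(real.shape, label.shape)  # (256, 256, 3) (256, 256, 3)
```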

Following the pix2pix paper, the training set is augmented during preprocessing with random jittering and mirroring.
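In the paper, random jitter means upscaling each pair to 286 by 286, taking the same random 256 by 256 crop from both images, and mirroring both horizontally with probability 0.5. A NumPy-only sketch (nearest-neighbour resize stands in for whatever interpolation the actual pipeline uses):

```python
import numpy as np

rng = np.random.default_rng(0)

def nearest_resize(img, size):
    """Nearest-neighbour resize of a square image to size x size."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

def random_jitter(real, label, up=286, out=256):
    # Upscale both images, take the SAME random crop from each,
    # then mirror both with probability 0.5.
    real, label = nearest_resize(real, up), nearest_resize(label, up)
    y = rng.integers(0, up - out + 1)
    x = rng.integers(0, up - out + 1)
    real = real[y:y + out, x:x + out]
    label = label[y:y + out, x:x + out]
    if rng.random() < 0.5:
        real, label = real[:, ::-1], label[:, ::-1]
    return real, label

a, b = random_jitter(np.zeros((256, 256, 3)), np.zeros((256, 256, 3)))
print(a.shape, b.shape)  # (256, 256, 3) (256, 256, 3)
```

Cropping and flipping the photo and the label map together keeps the pair pixel-aligned, which the conditional GAN depends on.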

U-Net Generator
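As a shape sketch of the pix2pix U-Net (filter counts follow the paper's encoder C64-C128-C256-C512x5 and decoder CD512x3-C512-C256-C128-C64; the real network also includes batch norm, dropout, and LeakyReLU/ReLU activations, omitted here):

```python
down = [64, 128, 256, 512, 512, 512, 512, 512]  # stride-2 convolutions
up = [512, 512, 512, 512, 256, 128, 64]         # stride-2 transposed convs

size, skips = 256, []
for f in down:            # each encoder block halves the spatial size
    size //= 2
    skips.append((size, f))
assert size == 1          # 256 / 2**8 -> 1x1 bottleneck

skips.pop()               # bottleneck output feeds the decoder directly
for f in up:              # each decoder block doubles the spatial size
    size *= 2
    s, skip_f = skips.pop()
    assert s == size      # skip connection matches the decoder resolution
    channels = f + skip_f # concatenate encoder activation (the skip link)
    print(f"{size}x{size}x{channels}")
size *= 2                 # final transposed conv -> 256x256x3, tanh output
print(f"{size}x{size}x3")
```

The skip connections let low-level structure (edges, object boundaries from the segmentation map) bypass the 1x1 bottleneck, which is why U-Net suits this aligned translation task.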

Generator Loss
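The pix2pix generator loss is the adversarial sigmoid cross-entropy (the discriminator should call the generated image real) plus an L1 distance to the target, weighted by lambda = 100. A NumPy sketch using the standard numerically stable formulation of sigmoid cross-entropy on logits:

```python
import numpy as np

LAMBDA = 100  # L1 weight from the pix2pix paper

def bce_with_logits(logits, labels):
    # Numerically stable sigmoid cross-entropy, averaged over elements.
    return float(np.mean(np.maximum(logits, 0) - logits * labels
                         + np.log1p(np.exp(-np.abs(logits)))))

def generator_loss(disc_fake_logits, generated, target):
    # Adversarial term: fool the discriminator (labels of all ones).
    gan_loss = bce_with_logits(disc_fake_logits,
                               np.ones_like(disc_fake_logits))
    # Reconstruction term: stay close to the ground-truth photo.
    l1_loss = float(np.mean(np.abs(target - generated)))
    return gan_loss + LAMBDA * l1_loss

# Zero logits and a perfect reconstruction give loss log(2) ~= 0.6931.
g = generator_loss(np.zeros((30, 30)),
                   np.ones((256, 256, 3)), np.ones((256, 256, 3)))
print(g)
```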

Discriminator
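The pix2pix discriminator is a PatchGAN: a stack of 4x4 convolutions whose 30x30 output grid classifies overlapping patches of the input pair as real or fake. A quick receptive-field check confirms each output unit sees a 70x70 patch:

```python
# (kernel, stride) for the PatchGAN conv stack: three stride-2 convs
# followed by two stride-1 convs (C64-C128-C256-C512-1 in the paper).
layers = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]

rf, jump = 1, 1
for k, s in layers:
    rf += (k - 1) * jump  # growth contributed by this layer
    jump *= s             # cumulative stride so far
print(rf)  # 70
```

Judging local patches instead of the whole image keeps the discriminator small and pushes it to penalise high-frequency artefacts, while the L1 term in the generator loss handles global correctness.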

Discriminator Loss
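The discriminator loss is the sum of two sigmoid cross-entropy terms: real pairs should be classified as ones, generated pairs as zeros. A NumPy sketch mirroring the generator-loss helper above:

```python
import numpy as np

def bce_with_logits(logits, labels):
    # Numerically stable sigmoid cross-entropy, averaged over elements.
    return float(np.mean(np.maximum(logits, 0) - logits * labels
                         + np.log1p(np.exp(-np.abs(logits)))))

def discriminator_loss(real_logits, fake_logits):
    real_loss = bce_with_logits(real_logits, np.ones_like(real_logits))
    fake_loss = bce_with_logits(fake_logits, np.zeros_like(fake_logits))
    return real_loss + fake_loss

# Zero logits (a maximally unsure discriminator) give 2*log(2) ~= 1.3863.
d = discriminator_loss(np.zeros((30, 30)), np.zeros((30, 30)))
print(d)
```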

Optimizers and Callback Functions
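The paper trains both networks with Adam at a learning rate of 2e-4 and beta_1 = 0.5 (the lower beta_1 is a common stabilisation trick for GANs). To make the update concrete, a minimal single-parameter Adam step in NumPy:

```python
import numpy as np

def adam_step(p, g, m, v, t, lr=2e-4, b1=0.5, b2=0.999, eps=1e-7):
    """One Adam update for parameters p with gradient g at step t."""
    m = b1 * m + (1 - b1) * g          # first-moment estimate
    v = b2 * v + (1 - b2) * g * g      # second-moment estimate
    m_hat = m / (1 - b1 ** t)          # bias correction
    v_hat = v / (1 - b2 ** t)
    return p - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

p, m, v = np.zeros(1), np.zeros(1), np.zeros(1)
p, m, v = adam_step(p, np.ones(1), m, v, t=1)
print(p)  # roughly -2e-4 after one unit-gradient step
```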

Training Functions
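One pix2pix training step runs the generator, scores both the real and the generated pair with the discriminator, and updates each network on its own loss. Since the notebook's framework code is not shown here, this structural sketch abstracts the networks and the gradient update away as plain callables; every argument name is a hypothetical stand-in:

```python
import numpy as np

def train_step(input_image, target, generator, discriminator,
               generator_loss, discriminator_loss, apply_gradients):
    fake = generator(input_image)
    real_logits = discriminator(input_image, target)   # conditioned on input
    fake_logits = discriminator(input_image, fake)
    g_loss = generator_loss(fake_logits, fake, target)
    d_loss = discriminator_loss(real_logits, fake_logits)
    apply_gradients(g_loss, d_loss)  # backprop happens here in the real code
    return g_loss, d_loss

# Exercise the structure with trivial stand-ins for the models and losses.
g_loss, d_loss = train_step(
    np.zeros((256, 256, 3)), np.zeros((256, 256, 3)),
    generator=lambda x: x,
    discriminator=lambda x, y: np.zeros((30, 30)),
    generator_loss=lambda fl, fake, tgt: float(np.mean(np.abs(tgt - fake))),
    discriminator_loss=lambda rl, fl: 0.0,
    apply_gradients=lambda g, d: None)
print(g_loss, d_loss)
```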